Blind Model Selection for Automatic Speech Recognition in Reverberant Environments
نویسندگان
چکیده
This communication presents a new method for automatic speech recognition in reverberant environments. Our approach consists in the selection of the best acoustic model out of a library of models trained on artificially reverberated speech databases corresponding to various reverberant conditions. Given a speech utterance recorded within a reverberant room, a Maximum Likelihood estimate of the fullband room reverberation time is computed using a statistical model for short-term log-energy sequences of anechoic speech. The estimated reverberation time is then used to select the best acoustic model, i.e., the model trained on a speech database most closely matching the estimated reverberation time, which serves to recognize the reverberated speech utterance. The proposed model selection approach is shown to improve significantly recognition accuracy for a connected digit task in both simulated and real reverberant environments, outperforming standard channel normalization techniques.
منابع مشابه
Model-based blind estimation of reverberation time: application to robust ASR in reverberant environments
This paper presents a method for blind estimation of reverberation times in reverberant enclosures. The proposed algorithm is based on a statistical model of short-term log-energy sequences for echo-free speech. Given a speech utterance recorded in a reverberant room, it computes a Maximum Likelihood estimate of the room full-band reverberation time. The estimation method is shown to require li...
متن کاملA Two-Channel Acoustic Front-End for Robust Automatic Speech Recognition in Noisy and Reverberant Environments
An acoustic front-end for robust automatic speech recognition in noisy and reverberant environments is proposed in this contribution. It comprises a blind source separation-based signal extraction scheme and only requires two microphone signals. The proposed front-end and its integration into the recognition system is analyzed and evaluated in noisy living room-like environments according to th...
متن کاملAn MTF-based blind restoration of temporal power envelopes as a front-end processor for automatic speech recognition systems in reverberant environments
To reduce speech degradation in reverberant environments, we previously proposed a modulation transfer function (MTF) based method of speech restoration. The room impulse response (RIR) in this restoration does not need to be measured at any time since we modeled the power envelope of the RIRs as an exponential decay function. Speech is assumed to be temporal modulated with white noise carrier ...
متن کاملBlind Deconvolution for Multi-microphone Speech Dereverberation: Application to Asr in Reverberant Environments
In this paper, a deterministic time-domain algorithm for multichannel blind deconvolution is presented. The proposed algorithm assumes that a source signal is measured by several sensors after propagating through finite impulse response channels and being corrupted by additive noise. An estimate of the source signal is obtained by first estimating the channel impulse responses in a Least Square...
متن کاملA Combined Approach for Estimating a Feature-domain Reverberation Model in Non-diffuse Environments
A combined approach for estimating a feature-domain reverberation model suitable for the robust distant-talking automatic speech recognition concept REMOS (REverberation MOdeling for Speech recognition) [1] is proposed. Based on a few calibration utterances recorded in the target environment, the combined approach employs ML estimation and blind estimation of the reverberation time to determine...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- VLSI Signal Processing
دوره 36 شماره
صفحات -
تاریخ انتشار 2004